An Iterative Scheme for the Approximate Linear Programming Solution to the Optimal Control of a Markov Decision Process

Abstract

This paper addresses the computational issues involved in the solution to an infinite-horizon optimal control problem for a Markov Decision Process (MDP) with a continuous state component and a discrete control input. The optimal Markov policy for the MDP can be determined based on the fixed point solution to the Bellman equation, which can be rephrased as a constrained Linear Program (LP) with an infinite number of constraints and an infinite dimensional optimization variable (the optimal value function). To compute an (approximate) solution to the LP, an iterative randomized scheme is proposed where the optimization variable is expressed as a linear combination of basis functions in a given class: at each iteration, the resulting semi-infinite LP is solved via constraint sampling, whereas the number of basis functions is progressively increased through the iterations so as to meet some performance goal. The effectiveness of the proposed scheme is shown on a multi-room heating system example.
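
The reformulation described in the abstract can be sketched as follows. This is an illustrative outline under assumed notation, not the paper's own statement: the abstract does not fix the optimality criterion, so a discounted reward with discount factor \gamma is assumed here, and the symbols X, U, r, T, c, \phi_i and w_i are introduced only for this sketch. The equivalence between the Bellman fixed point and the LP holds under standard regularity conditions that are likewise assumed.

```latex
% Assumed notation (not given in the abstract): state space X, finite control
% set U, stage reward r, discount factor \gamma \in (0,1), transition kernel T,
% state-relevance weighting c, basis functions \phi_i, weights w_i.

% Bellman equation, whose fixed point is the optimal value function V^*:
\[
V^*(x) = \max_{u \in U} \Big\{ r(x,u) + \gamma \int_X V^*(x')\, T(dx' \mid x,u) \Big\},
\qquad x \in X .
\]

% Infinite-dimensional LP (one constraint per state-control pair):
\[
\min_{V} \int_X V(x)\, c(dx)
\quad \text{s.t.} \quad
V(x) \ge r(x,u) + \gamma \int_X V(x')\, T(dx' \mid x,u)
\quad \forall (x,u) \in X \times U .
\]

% Restricting V to V_w(x) = \sum_{i=1}^{n} w_i \phi_i(x) gives a semi-infinite LP in w:
\[
\min_{w \in \mathbb{R}^n} \sum_{i=1}^{n} w_i \int_X \phi_i(x)\, c(dx)
\quad \text{s.t.} \quad
\sum_{i=1}^{n} w_i \Big[ \phi_i(x) - \gamma \int_X \phi_i(x')\, T(dx' \mid x,u) \Big] \ge r(x,u)
\quad \forall (x,u) \in X \times U .
\]

% Constraint sampling: enforce the constraint only at N randomly drawn pairs
\[
(x^{(k)}, u^{(k)}), \quad k = 1, \dots, N ,
\]
% which leaves a finite LP with n variables and N constraints.
```

In the iterative scheme, the number of basis functions n would then be increased from one iteration to the next and the sampled LP solved again until the performance goal is met. The abstract does not say how that goal is measured, so no stopping rule is given here.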

Bibliographic details

Authors: Falsone, Alessandro; Prandini, Maria
Year: 2015
Original format: PDF
Language: English

Similar literature

1. Policy Search for the Optimal Control of Markov Decision Processes: A Novel Particle-Based Iterative Scheme [J]. Giorgio Manganini, Matteo Pirotta, Marcello Restelli. IEEE Transactions on Cybernetics. 2016, No. 11

2. Optimally solving Markov decision processes with total expected discounted reward function: Linear programming revisited [J]. Oguzhan Alagoz, Mehmet U.S. Ayvaci, Jeffrey T. Linderoth. Computers & Industrial Engineering. 2015, No. sepa

3. Linear Programming and Constrained Average Optimality for General Continuous-Time Markov Decision Processes in History-Dependent Policies [J]. Xianping Guo, Yonghui Huang, Xinyuan Song. SIAM Journal on Control and Optimization. 2012, No. 1

4. An iterative scheme for the approximate linear programming solution to the optimal control of a Markov Decision Process [C]. Alessandro Falsone, Maria Prandini. European Control Conference. 2015

5. Markov Decision Processes and Approximate Dynamic Programming Methods for Optimal Treatment Design [D]. Jennifer Elizabeth Mason. 2012

6. Evaluation of linearly solvable Markov decision process with dynamic model learning in a mobile robot navigation task [O]. Ken Kinjo, Eiji Uchibe, Kenji Doya. 2013

7. Policy Search for the Optimal Control of Markov Decision Processes: A Novel Particle-Based Iterative Scheme [O]. Giorgio Manganini, Matteo Pirotta, Marcello Restelli. 2016
